We address the problem of learning an unknown unitary transformation from a finite number of
examples. The problem consists in finding the learning machine that optimally emulates the examples,
thus reproducing the unknown unitary with maximum fidelity. Learning a unitary is equivalent to
storing it in the state of a quantum memory (the memory of the learning machine), and subsequently
retrieving it. We prove that, whenever the unknown unitary is drawn from a group, the optimal 
strategy consists in a parallel call of the available uses followed by a "measure-and-rotate" retrieving. 
Differing from the case of quantum cloning, where the incoherent "measure-and-prepare" strategies
are typically suboptimal, in the case of learning the "measure-and-rotate" strategy is optimal even 
when the learning machine is asked to reproduce a single copy of the unknown unitary. We finally 
address the problem of the optimal inversion of an unknown unitary evolution, showing also in this 
case the optimality of the "measure-and-rotate" strategies and applying our result to the optimal 
approximate realignment of reference frames for quantum communication. 

PACS numbers: 03.67.-a,03.67.Ac,03.67.Hk 



I. INTRODUCTION 

A quantum memory would be an invaluable resource 
for Quantum Technology, and extensive experimental 
work is in progress for its realization T-'s'] . On a quantum 
memory one can store unknown quantum states. Can we 
exploit it to store an unknown quantum transformation? 
In this way we could transmit the transformation to a 
distant party by just transmitting a state, without the 
need of transferring the device. More generally, we could 
process the transformation with the usual state manip- 
ulation techniques, as noticed by Vidal, Masanes, and 
Cirac, who addressed the problem in Ref. [J. 

Storing-retrieving of transformations can also be seen 
as an instance of quantum learning, a topic which re- 
ceived increasing attention in the past few years (see e.g. 
Refs. for different approaches): Suppose that a 

user can dispose of N uses of a black box implementing 
an unknown unitary transformation U. Today the user is 
allowed to exploit the black box at his convenience, run- 
ning an arbitrary quantum circuit that makes N calls to 
it. Tomorrow, however, the black box will no longer be 
available, and the user will be asked to reproduce U on 
a new input state $|\psi\rangle$ unknown to him or her. We refer
to this scenario as quantum learning of the unitary U
from a finite set of N examples. Generally, the user may 
be required to reproduce U more than once, i.e. to pro- 
duce M > 1 copies of U. In this case it is important to 
assess how the performance of learning decays with the 
number of copies required, as it was done in the case of 
quantum cloning Q. 



Let us consider first the M = N = 1 case. Clearly,
the only thing we can do today is to apply the black box
to a known (generally entangled) state $|\varphi\rangle$. After that,
what remains is the state $|\varphi_U\rangle = (U \otimes I)|\varphi\rangle$, which can
be stored in a quantum memory. Then, when the new
input state $|\psi\rangle$ becomes available, we send $|\psi\rangle$ and $|\varphi_U\rangle$
to an optimal retrieving channel, which emulates U applied
to $|\psi\rangle$. If N > 1 input copies are available, we
must also find the best storing strategy: we can, e.g.,
opt for a parallel strategy where U is applied on N different
systems, yielding $(U^{\otimes N} \otimes I)|\varphi\rangle$, or for a sequential
strategy where U is applied N times on the same
system, alternated with other known unitaries, yielding
$(U V_{N-1} \cdots V_2 U V_1 U \otimes I)|\varphi\rangle$. The most general storing

strategy is described by a quantum circuit board, i.e., a 
quantum network with open slots where the input copies 
can be inserted [10]. In summary, solving the problem
of the optimal quantum learning means finding the opti- 
mal storing board and the optimal retrieving channel. 

An alternative to coherent retrieval is to estimate U, 
to store the outcome in a classical memory, and to per- 
form the estimated unitary on the new input state. This 
incoherent estimation-based strategy has the double advantage
of avoiding the expensive use of a quantum memory
(which nowadays cannot store information for more
than a few milliseconds), and of allowing one to reproduce
U an unlimited number of times with constant quality.
However, estimation-based strategies are typically
suboptimal for the similar task of quantum cloning 
and, by analogy, one would expect a coherent retrieval to 
achieve better performances. Surprisingly, we find that 
whenever the unknown unitary is randomly drawn from
a group the incoherent strategies already achieve the ultimate
performances for quantum learning. In particular,
we show that the performance of the optimal retrieving
channel is equal to that of optimal estimation. For example,
for a completely unknown qubit unitary the optimal
fidelity behaves as $F = 1 - O(N^{-2})$ asymptotically for
large N. Our result can also be extended to solve the
problem of optimal inversion of the unknown U, in which
the user is asked to perform $U^\dagger$. In this case, we provide
the optimal approximate realignment of reference
frames for the quantum communication scenario considered
in Ref. [11], reaching the above asymptotic fidelity
without ancilla. The paper is structured as follows: in
Sec. II we introduce the notation and the theoretical
framework used to solve the problem of optimal learning.
The optimization is then presented in Sec. III by
first addressing the case of a single output copy (Subsect.
III A), and subsequently showing how to generalize the
argument to the case of M > 1 output copies (Subsect.
III B). In Sec. IV we discuss the problem of the optimal
inversion of an unknown quantum dynamics, which can
be regarded as a small variation of our learning problem.
Sec. V concludes the article with a summary of the main
results.



II. NOTATION AND THEORETICAL 
FRAMEWORK 

To derive the optimal learning we use the method of 
quantum combs [9], briefly summarized here. For more
details and for an extensive presentation of the method
we refer to Ref. [10].

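Let $\mathrm{Lin}(\mathcal H)$ denote the space of linear operators acting
on the Hilbert space $\mathcal H$, and $\mathrm{Lin}(\mathcal H, \mathcal K)$ be the space of
linear operators from $\mathcal H$ to $\mathcal K$. In the following we will use
the one-to-one correspondence between bipartite vectors
$|A\rangle\rangle \in \mathcal K \otimes \mathcal H$ and linear operators $A \in \mathrm{Lin}(\mathcal H, \mathcal K)$ given
by

$$|A\rangle\rangle = \sum_{m=1}^{\dim(\mathcal K)} \sum_{n=1}^{\dim(\mathcal H)} \langle m|A|n\rangle \, |m\rangle|n\rangle , \tag{1}$$

where $\{|m\rangle\}_{m=1}^{\dim(\mathcal K)}$ and $\{|n\rangle\}_{n=1}^{\dim(\mathcal H)}$ are two fixed orthonormal
bases for $\mathcal K$ and $\mathcal H$, respectively.

If A and B are two commuting operators in $\mathrm{Lin}(\mathcal H)$ it
is simple to derive from Eq. (1) the equality

$$(A \otimes I_{\mathcal H})\,|B\rangle\rangle = (I_{\mathcal H} \otimes A^{T})\,|B\rangle\rangle , \tag{2}$$

where $I_{\mathcal H}$ is the identity operator on $\mathcal H$ and $A^{T}$ denotes
the transpose of A with respect to the orthonormal basis
$\{|n\rangle\}$.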
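The correspondence of Eq. (1) and the identity of Eq. (2) can be checked numerically. The following is a minimal sketch, assuming NumPy and a row-major flattening of the matrix as the representation of the double-ket; the helper name `dket` and the test matrices are illustrative choices, not part of the original text.

```python
import numpy as np

def dket(A):
    """Double-ket |A>> = sum_{m,n} <m|A|n> |m>|n>, stored as a row-major vector."""
    return np.asarray(A, dtype=complex).reshape(-1)

rng = np.random.default_rng(0)
d = 3
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
B = A @ A + 2.0 * A            # a polynomial in A, hence [A, B] = 0
I = np.eye(d)

lhs = np.kron(A, I) @ dket(B)      # (A x I)|B>>   equals |AB>>
rhs = np.kron(I, A.T) @ dket(B)    # (I x A^T)|B>> equals |BA>>

left_rule = np.allclose(lhs, dket(A @ B))
right_rule = np.allclose(rhs, dket(B @ A))
eq2_holds = np.allclose(lhs, rhs)  # true because A and B commute
```

The two intermediate rules, $(A \otimes I)|B\rangle\rangle = |AB\rangle\rangle$ and $(I \otimes A^{T})|B\rangle\rangle = |BA\rangle\rangle$, hold for arbitrary A and B; Eq. (2) is their coincidence when AB = BA.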

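A quantum channel $\mathcal C$ from $\mathrm{Lin}(\mathcal H)$ to $\mathrm{Lin}(\mathcal K)$ is a completely
positive trace-preserving map, and is conveniently
described by its Choi-Jamiolkowski operator, namely by
the positive operator $C \in \mathrm{Lin}(\mathcal K \otimes \mathcal H)$ defined by

$$C = (\mathcal C \otimes \mathcal I_{\mathcal H})(|I_{\mathcal H}\rangle\rangle\langle\langle I_{\mathcal H}|) , \tag{3}$$

where $\mathcal I_{\mathcal H}$ is the identity map on $\mathrm{Lin}(\mathcal H)$, and, according
to Eq. (1), $|I_{\mathcal H}\rangle\rangle$ is the maximally entangled vector
$|I_{\mathcal H}\rangle\rangle = \sum_n |n\rangle|n\rangle \in \mathcal H^{\otimes 2}$.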

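The composition of two channels is represented in
terms of their Choi-Jamiolkowski operators by the link
product [9, 10]. Precisely, if $\mathcal D$ is a channel from $\mathcal K$ to $\mathcal L$,
the Choi operator of the channel $\mathcal D \circ \mathcal C$ resulting from the
composition of $\mathcal C$ and $\mathcal D$ is given by the product

$$D * C = \mathrm{Tr}_{\mathcal K}\big[(D \otimes I_{\mathcal H})(I_{\mathcal L} \otimes C^{T_{\mathcal K}})\big] , \tag{4}$$

with $T_{\mathcal K}$ denoting partial transpose on $\mathcal K$. Viewing states
as a special kind of channels with one-dimensional input
space, Eq. (4) yields $\mathcal C(\rho) = C * \rho = \mathrm{Tr}_{\mathcal H}[C (I_{\mathcal K} \otimes \rho^{T})]$. A
channel $\mathcal C$ from $\mathcal H$ to $\mathcal K$ is trace preserving if and only if
it satisfies the normalization condition

$$I_{\mathcal K} * C \equiv \mathrm{Tr}_{\mathcal K}[C] = I_{\mathcal H} . \tag{5}$$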
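Equations (3)-(5) can be illustrated on unitary channels. The sketch below assumes NumPy and represents a Choi operator on $\mathcal K \otimes \mathcal H$ as a four-index array; the function names (`choi_unitary`, `apply_choi`, `link`, `random_unitary`) are hypothetical helpers written for this example. In index form the link product of Eq. (4) reduces to contracting the shared wire $\mathcal K$, which is what the `einsum` call does.

```python
import numpy as np

def dket(A):
    """|A>> = sum_{m,n} <m|A|n> |m>|n>, stored row-major."""
    return np.asarray(A, dtype=complex).reshape(-1)

def choi_unitary(U):
    """Choi operator |U>><<U| of the channel rho -> U rho U^dagger, Eq. (3)."""
    v = dket(U)
    return np.outer(v, v.conj())

def apply_choi(C, rho, d_out, d_in):
    """C(rho) = Tr_H[C (I_K x rho^T)], written as an index contraction."""
    C4 = C.reshape(d_out, d_in, d_out, d_in)
    return np.einsum('khlm,hm->kl', C4, rho)

def link(D, C, dL, dK, dH):
    """Link product D * C of Eq. (4): D on L x K, C on K x H, result on L x H."""
    D4 = D.reshape(dL, dK, dL, dK)
    C4 = C.reshape(dK, dH, dK, dH)
    return np.einsum('akbc,kdce->adbe', D4, C4).reshape(dL * dH, dL * dH)

def random_unitary(d, rng):
    """A unitary obtained from the QR decomposition of a complex Gaussian matrix."""
    Z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    Q, _ = np.linalg.qr(Z)
    return Q

rng = np.random.default_rng(1)
d = 2
U, V = random_unitary(d, rng), random_unitary(d, rng)
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())

C, D = choi_unitary(U), choi_unitary(V)
action_ok = np.allclose(apply_choi(C, rho, d, d), U @ rho @ U.conj().T)
compose_ok = np.allclose(link(D, C, d, d, d), choi_unitary(V @ U))
trace_ok = np.allclose(np.einsum('khkm->hm', C.reshape(d, d, d, d)), np.eye(d))
```

Here `compose_ok` checks that the link product of the two Choi operators is the Choi operator of the composed channel, and `trace_ok` checks the normalization condition of Eq. (5).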

For two channels with multipartite input and output,
one can decide to connect only some particular output
of the first channel to some input of the second one: for
example, if $\mathcal C$ is a channel from $\mathrm{Lin}(\mathcal H \otimes \mathcal A)$ to $\mathrm{Lin}(\mathcal K \otimes \mathcal B)$
and $\mathcal D$ is a channel from $\mathrm{Lin}(\mathcal A' \otimes \mathcal K)$ to $\mathrm{Lin}(\mathcal B' \otimes \mathcal L)$ we can
connect the wires with the same label $\mathcal K$, thus obtaining
the new channel $(\mathcal D \otimes \mathcal I_{\mathcal B})(\mathcal I_{\mathcal A'} \otimes \mathcal C)$, which is a channel
from $\mathrm{Lin}(\mathcal A' \otimes \mathcal H \otimes \mathcal A)$ to $\mathrm{Lin}(\mathcal B' \otimes \mathcal L \otimes \mathcal B)$. Accordingly,
the connections of quantum channels in a network will
be encoded in the labels assigned to the Hilbert spaces:
whenever two spaces have the same label, two channels
acting on these spaces will be connected, and their Choi-Jamiolkowski
operators will be contracted with the link
product as in Eq. (4).

Remark (reordering of Hilbert spaces and com- 
mutativity of the link product). Encoding the con- 
nections in the labeling of the Hilbert spaces turns out 
to be very convenient in the treatment of multipartite 
quantum networks, because some formulae take a much 
simpler form if we suitably rearrange the ordering of 
the Hilbert spaces in the tensor product. For exam- 
ple, it may be convenient to rewrite the tensor product 
^'i=i^^ Hi putting all spaces with even labels on the left 
and all spaces with odd labels on the right. This re- 
ordering can be done safely as long as different Hilbert 
spaces have different labels. Note that the link product 
of two Choi-Jamiolkowski operators is commutative up 
to this reordering of Hilbert spaces: for example, given 
two operators C G Lin(/C (8 Ti) and D £ Lin(£ (X) K.) with 
•H ~ /C ~ £, we have D*C = SWAP (C * D) SWAP, 
where SWAP is the operator that exchanges the Hilbert 
spaces C and H in the tensor product C®T-L. The reader 
should not be confused by fact that the link product 
is commutative (up to reordering of the Hilbert spaces) 
whereas the composition of channels is not (C o I? is in 
general different from VoC). The fact that the output 
of C is connected with the input of 2? (and not the other 
way round) is encoded in the fact that the output space 
of C has the same label of the input space of V (here 
they are both labeled as IC). In order to express the dif- 
ferent composition of channels corresponding to C o I? we 
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would have had to choose a different labeling, in which 
the output of T) is identified with the input of C 

A quantum circuit board is the quantum network re- 
sulting from a sequence of multipartite channels where 
some input of a channel is connected to some output of 
the previous one, as we just illustrated. A quantum comb 
is the Choi-Jamiolkowski operator associated to a quan- 
tum circuit board, and is obtained as the link product 
of all component channels. The fact that the circuit
board represents a sequence of (trace-preserving) channels
is expressed by a set of linear equations, and,
therefore, optimizing a quantum circuit board is equiv- 
alent to optimizing a positive operator subject to these 
linear constraints. The constraints will be given explic- 
itly for the case of learning in the next section. 



III. OPTIMIZATION OF LEARNING



In this section we show that the optimal quantum 
learning of an unknown unitary randomly drawn from 
a group has a very simple and general structure: (i) in 
order to store the unitary it is enough to apply the avail- 
able examples in parallel on a suitable entangled state, 
(ii) the optimal state for storage has the same form of an 
optimal state for estimation of the unknown unitary, and 
(iii) the optimal retrieval can be achieved via estimation
of the unknown unitary, namely by measuring the quan- 
tum memory, producing an estimate for the unknown 
unitary, and finally, applying the estimate M times. 



A. The M = 1 case 

We tackle the optimization of learning starting from
the case where a single output copy is required. Referring
to Fig. 1, we label the Hilbert spaces of quantum systems
according to the following sequence: $(\mathcal H_{2n+1})_{n=0}^{N-1}$
are the inputs for the N examples of U, and $(\mathcal H_{2n+2})_{n=0}^{N-1}$
are the corresponding outputs. We denote by
$\mathcal H_i = \bigotimes_{n=0}^{N-1} \mathcal H_{2n+1}$ ($\mathcal H_o = \bigotimes_{n=0}^{N-1} \mathcal H_{2n+2}$) the Hilbert spaces
of all inputs (outputs) of the N examples. The input
state $|\psi\rangle$ belongs to $\mathcal H_{2N+2}$, and the output state finally
produced belongs to $\mathcal H_{2N+3}$. All spaces $\mathcal H_n$ considered
here are d-dimensional, except the spaces $\mathcal H_0$ and $\mathcal H_{2N+1}$,
which are one-dimensional and are introduced just for
notational convenience. The comb of the whole learning
process is an operator $L \ge 0$ on the tensor product of all Hilbert
spaces and satisfies the normalization condition

$$\mathrm{Tr}_{2k+1}[L^{(k)}] = I_{2k} \otimes L^{(k-1)}, \qquad k = 0, \dots, N+1 , \tag{6}$$

where $L^{(N+1)} = L$, $L^{(-1)} = 1$, and $L^{(k)}$ is a positive
operator on the spaces $(\mathcal H_n)_{n=0}^{2k+1}$. When the N examples
are connected with the learning board, the user obtains
a channel $\mathcal C_U$ with Choi operator given by

$$C_U = L * |U\rangle\rangle\langle\langle U|^{\otimes N} = \mathrm{Tr}_{i,o}\big[L\,\big(I_{2N+3} \otimes I_{2N+2} \otimes (|U\rangle\rangle\langle\langle U|^{\otimes N})^{T}\big)\big] , \tag{7}$$

as it follows from the definition of link product in Eq. (4).

FIG. 1: The learning process is described by a quantum comb
(in white) representing the storing board, in which the N uses
of a unitary U are plugged, along with the state $|\psi\rangle$ (in gray).
The wires represent the input-output Hilbert spaces. The
output of the first comb is stored in a quantum memory, later
used by the retrieving channel $\mathcal R$.
As the figure of merit we maximize the fidelity of the
output state $\mathcal C_U(|\psi\rangle\langle\psi|)$ with the target state $U|\psi\rangle\langle\psi|U^{\dagger}$,
uniformly averaged over all input pure states $|\psi\rangle$ and all
unknown unitaries U in the group G. Apart from irrelevant
constants, such optimization coincides with the
maximization of the channel fidelity between $\mathcal C_U$ and
the target unitary (i.e. the fidelity between the Choi-Jamiolkowski
states $C_U/d$ and $|U\rangle\rangle\langle\langle U|/d$) averaged over
U:

$$F = \frac{1}{d^2} \int_G \mathrm{Tr}\big\{ L \big[ |U\rangle\rangle\langle\langle U| \otimes (|U^{*}\rangle\rangle\langle\langle U^{*}|)^{\otimes N} \big] \big\}\, dU , \tag{8}$$

$U^{*}$ being the complex conjugate of U in the computational
basis, and dU denoting the normalized Haar measure.
From the expression of F it is easy to prove that
there is no loss of generality in requiring the commutation

$$\big[L,\; U_{2N+3} \otimes V^{*}_{2N+2} \otimes (U^{*} \otimes V)^{\otimes N}\big] = 0 \qquad \forall\, U, V \in G . \tag{9}$$
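The commutation relation (9) follows from a group-averaging argument that the text leaves implicit. The following is a sketch of that step, assuming the tensor factors are ordered as (output $2N+3$, input $2N+2$, example outputs, example inputs); the symbols $W$, $G_{W,V}$ and $\bar L$ are introduced here only for illustration.

```latex
% Change of variables U -> W U V^\dagger in the Haar integral of Eq. (8),
% using |AB>> = (A \otimes I)|B>> and |BA>> = (I \otimes A^T)|B>>:
|W U V^{\dagger}\rangle\rangle = (W \otimes V^{*})\,|U\rangle\rangle ,
\qquad
|(W U V^{\dagger})^{*}\rangle\rangle = (W^{*} \otimes V)\,|U^{*}\rangle\rangle .

% Hence F is unchanged if L is replaced by its average over G x G:
G_{W,V} := W_{2N+3} \otimes V^{*}_{2N+2} \otimes (W^{*} \otimes V)^{\otimes N} ,
\qquad
\bar L := \int_G dW \int_G dV \; G_{W,V}^{\dagger}\, L \, G_{W,V} ,

F(\bar L) = F(L) , \qquad [\bar L ,\, G_{W,V}] = 0 \quad \forall\, W, V \in G .
```

Since the averaging acts through local unitaries on each wire and the constraints of Eq. (6) are linear, $\bar L$ should again be an admissible comb, so restricting to operators that satisfy Eq. (9) costs nothing in fidelity.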

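Moreover, using Eq. (6) for k = N + 1 we obtain
$\mathrm{Tr}_{2N+3}[L] = I_{2N+2} \otimes L^{(N)}$, where $L^{(N)}$ is a positive
operator acting on $\bigotimes_{n=0}^{2N+1} \mathcal H_n$ (recall that, however, $\mathcal H_0$
and $\mathcal H_{2N+1}$ are one-dimensional). Reordering the Hilbert
spaces in the tensor product by putting all input spaces
of the examples on the right and all output spaces on the
left and using Eq. (9) we then get

$$\big[L^{(N)},\; U_o^{*\otimes N} \otimes V_i^{\otimes N}\big] = 0 \qquad \forall\, U, V \in G . \tag{10}$$

Here the subscripts i, o recall that $U_o^{*\otimes N}$ acts on the tensor
product of all output spaces $\mathcal H_o = \bigotimes_{n=0}^{N-1} \mathcal H_{2n+2}$, while
$V_i^{\otimes N}$ acts on the tensor product of all input spaces
$\mathcal H_i = \bigotimes_{n=0}^{N-1} \mathcal H_{2n+1}$. This leads to the following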

Lemma 1 (Optimality of parallel storage) The optimal
storage of U can be achieved by applying $U^{\otimes N} \otimes I^{\otimes N}$
on a suitable input state $|\varphi\rangle \in \mathcal H_o \otimes \mathcal H_i$.

Proof. According to Fig. 1 the learning board $\mathcal L$ results
from the connection of the storing board $\mathcal S$ with
the retrieving channel $\mathcal R$. In terms of the corresponding
Choi-Jamiolkowski operators L, S, R, respectively, one
has $L = R * S$. Denoting by $\mathcal H_M$ the Hilbert space of the
quantum memory in Fig. 1 we have that $\mathcal R$ is a channel
from $(\mathcal H_{2N+2} \otimes \mathcal H_M)$ to $\mathcal H_{2N+3}$, and satisfies the normalization
condition $I_{2N+3} * R = I_{2N+2} \otimes I_M$. Using this
fact, one gets $\mathrm{Tr}_{2N+3}[L] = I_{2N+3} * L = (I_{2N+3} * R) * S =
(I_{2N+2} \otimes I_M) * S = I_{2N+2} \otimes \mathrm{Tr}_M[S]$, which compared
with Eq. (6) for k = N + 1 implies $\mathrm{Tr}_M[S] = L^{(N)}$. Now,
without loss of generality we take the storing board $\mathcal S$
to be a sequence of isometries, which implies that
S is rank one: $S = |\Phi\rangle\rangle\langle\langle\Phi|$. With this choice, the state
$S/d^{N}$ is a purification of $L^{(N)}/d^{N}$. Again, one can choose
w.l.o.g. $S/d^{N}$ to be a state on $(\mathcal H_o \otimes \mathcal H_i) \otimes (\mathcal H_{o'} \otimes \mathcal H_{i'})$,
with $\mathcal H_{o'} \simeq \mathcal H_o$ and $\mathcal H_{i'} \simeq \mathcal H_i$, and assume $|\Phi\rangle\rangle = |L^{(N)\,1/2}\rangle\rangle$.
Taking V = I in Eq. (10) and using Eq. (2) we get
$(U_o^{*\otimes N} \otimes I_{i,o',i'})\,|\Phi\rangle\rangle = (I_{o,i} \otimes U_{o'}^{\dagger\otimes N} \otimes I_{i'})\,|\Phi\rangle\rangle$. When
the examples of U are connected to the storing board,
the output is the state $\rho_U = S * |U\rangle\rangle\langle\langle U|^{\otimes N}$. Using the
above relation we find that $\rho_U$ is the projector on the
state $|\varphi_U\rangle = (U_{o'}^{\otimes N} \otimes I_{i'})\,|\varphi\rangle$, where $|\varphi\rangle = \langle\langle I^{\otimes N}|_{o,i}\,|\Phi\rangle\rangle \in
\mathcal H_{o'} \otimes \mathcal H_{i'} \simeq \mathcal H_o \otimes \mathcal H_i$. This proves that every storing board
gives the same output that would be obtained with a parallel
scheme. In other words, every storing board can be
simulated applying $U^{\otimes N} \otimes I^{\otimes N}$ to a suitable input state
$|\varphi\rangle \in \mathcal H_o \otimes \mathcal H_i$. ■

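Optimizing learning is then reduced to finding the
optimal input state $|\varphi\rangle$ and the optimal retrieving
channel $\mathcal R$. The fidelity can be computed substituting
$L = R * S$ in Eq. (8) and using the relation
$\langle\langle U|\langle\langle U^{*}|^{\otimes N} (R * S) |U\rangle\rangle|U^{*}\rangle\rangle^{\otimes N} = \langle\langle U|R|U\rangle\rangle *
\langle\langle U^{*}|^{\otimes N} S |U^{*}\rangle\rangle^{\otimes N} = \langle\langle U|R|U\rangle\rangle * |\varphi_U\rangle\langle\varphi_U|$, which gives

$$F = \frac{1}{d^2} \int_G \langle\langle U|\langle\varphi_U^{*}|\, R \,|U\rangle\rangle|\varphi_U^{*}\rangle \, dU . \tag{11}$$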



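Lemma 2 (Optimal states for storage) The optimal
input state for storage can be taken of the form

$$|\varphi\rangle = \bigoplus_{j \in \mathrm{Irr}(U^{\otimes N})} \sqrt{\frac{p_j}{d_j}}\; |I_j\rangle\rangle \in \tilde{\mathcal H} , \tag{12}$$

where $p_j$ are probabilities, the index j runs over the
set $\mathrm{Irr}(U^{\otimes N})$ of all irreducible representations $\{U_j\}$
contained in the decomposition of $\{U^{\otimes N}\}$, and
$\tilde{\mathcal H} = \bigoplus_{j \in \mathrm{Irr}(U^{\otimes N})} (\mathcal H_j \otimes \mathcal H_j)$ is a subspace of $\mathcal H_o \otimes \mathcal H_i$ carrying
the representation $\tilde U = \bigoplus_{j \in \mathrm{Irr}(U^{\otimes N})} (U_j \otimes I_j)$, $I_j$ being
the identity in $\mathcal H_j$.

Proof. Using Eqs. (2) and (10) it is possible to show
that the marginal state $\rho = \mathrm{Tr}_i[|\varphi\rangle\langle\varphi|]$ is invariant
under $U^{\otimes N}$. Decomposing $U^{\otimes N}$ into irreducible representations
(irreps) we have $U^{\otimes N} = \bigoplus_j (U_j \otimes I_{m_j})$,
where $I_{m_j}$ is the identity on an $m_j$-dimensional multiplicity
space $\mathbb C^{m_j}$. Therefore, $\rho$ must have the form
$\rho = \bigoplus_j p_j (I_j/d_j \otimes \rho_j)$, where $\rho_j$ is an arbitrary state
on the multiplicity space $\mathbb C^{m_j}$. Since $|\varphi\rangle$ is a purification
of $\rho$, with a suitable choice of basis we have
$|\varphi\rangle = |\rho^{1/2}\rangle\rangle = \bigoplus_j \sqrt{p_j/d_j}\; |I_j\rangle\rangle |\rho_j^{1/2}\rangle\rangle$, which after storage
becomes $|\varphi_U\rangle = \bigoplus_j \sqrt{p_j/d_j}\; |U_j\rangle\rangle |\rho_j^{1/2}\rangle\rangle$. Hence, for
every U the state $|\varphi_U\rangle$ belongs to the subspace
$\tilde{\mathcal H} = \bigoplus_j \big(\mathcal H_j^{\otimes 2} \otimes |\rho_j^{1/2}\rangle\rangle\big) \simeq \bigoplus_j \mathcal H_j^{\otimes 2}$. ■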
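As an illustration of Lemma 2, consider the smallest nontrivial case. This is a worked instance written for this purpose, not taken from the original text: it assumes G = SU(2) acting on qubits (d = 2) with N = 2 examples, for which $U \otimes U$ decomposes into the singlet (j = 0) and the triplet (j = 1), each appearing once.

```latex
% N = 2 qubit example: U \otimes U \simeq U_0 \oplus U_1, with d_0 = 1, d_1 = 3, m_0 = m_1 = 1
|\varphi\rangle = \sqrt{p_0}\,|I_0\rangle\rangle \;\oplus\; \sqrt{\tfrac{p_1}{3}}\,|I_1\rangle\rangle ,
\qquad p_0 + p_1 = 1 ,

% the storage subspace has dimension 1^2 + 3^2 = 10
\tilde{\mathcal H} = (\mathcal H_0 \otimes \mathcal H_0) \oplus (\mathcal H_1 \otimes \mathcal H_1) ,

% after the two uses of U (U_0 = 1 is the trivial representation)
|\varphi_U\rangle = \sqrt{p_0}\,|I_0\rangle\rangle \;\oplus\; \sqrt{\tfrac{p_1}{3}}\,|U_1\rangle\rangle .
```

The only free parameters left in the storage state are the probabilities $p_0$ and $p_1$, which is the sense in which Lemma 2 reduces the optimization over input states to an optimization over $\{p_j\}$.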

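We can then restrict our attention to the subspace $\tilde{\mathcal H}$,
and consider retrieving channels $\mathcal R$ from $(\mathcal H_{2N+2} \otimes \tilde{\mathcal H})$ to
$\mathcal H_{2N+3}$. The normalization of the Choi operator is then

$$\mathrm{Tr}_{2N+3}[R] = I_{2N+2} \otimes I_{\tilde{\mathcal H}} . \tag{13}$$

Combining the expression of the fidelity (11) with that of
the input state (12), it is easy to see that one can always
use a covariant retrieving channel, satisfying

$$\big[R,\; U_{2N+3} \otimes V^{*}_{2N+2} \otimes \tilde U^{*} \tilde V'\big] = 0 \qquad \forall\, U, V \in G , \tag{14}$$

where $\tilde V' = \bigoplus_j (I_j \otimes V_j)$ acts on $\tilde{\mathcal H}$. We now exploit the
decompositions $U \otimes U_j^{*} = \bigoplus_{K \in \mathrm{Irr}(U \otimes U_j^{*})} (U_K \otimes I_{m_K^{(j)}})$
and $V^{*} \otimes V_j = \bigoplus_{L \in \mathrm{Irr}(V^{*} \otimes V_j)} (V_L^{*} \otimes I_{m_L^{(j)}})$, which yield

$$U_{2N+3} \otimes V^{*}_{2N+2} \otimes \tilde U^{*} \tilde V' = \bigoplus_{K,L} \big(U_K \otimes V_L^{*} \otimes I_{m_{KL}}\big) . \tag{15}$$

Here $I_{m_{KL}}$ is given by $I_{m_{KL}} = \bigoplus_{j \in P_{KL}} (I_{m_K^{(j)}} \otimes I_{m_L^{(j)}})$,
where $P_{KL}$ is the set of values of j such that the irrep
$U_K \otimes V_L^{*}$ is contained in the decomposition of $U \otimes U_j^{*} \otimes
V^{*} \otimes V_j$. Relations (14) and (15) then imply

$$R = \bigoplus_{K,L} \big(I_K \otimes I_L \otimes R_{KL}\big) , \tag{16}$$

where $R_{KL}$ is a positive operator on the multiplicity
space $\mathbb C^{m_{KL}} = \bigoplus_{j \in P_{KL}} (\mathbb C^{m_K^{(j)}} \otimes \mathbb C^{m_L^{(j)}})$. Moreover, using
the equality $I \otimes I_j = \bigoplus_K (I_K \otimes I_{m_K^{(j)}})$ we obtain

$$|I\rangle\rangle|\varphi^{*}\rangle = \bigoplus_j \sqrt{\frac{p_j}{d_j}} \bigoplus_{K \in \mathrm{Irr}(U \otimes U_j^{*})} |I_K\rangle\rangle |I_{m_K^{(j)}}\rangle\rangle = \bigoplus_K |I_K\rangle\rangle |\alpha_K\rangle , \tag{17}$$

where $|I_K\rangle\rangle \in \mathcal H_K^{\otimes 2}$ and $|\alpha_K\rangle \in \mathbb C^{m_{KK}}$ is given by

$$|\alpha_K\rangle = \bigoplus_{j \in P_{KK}} \sqrt{\frac{p_j}{d_j}}\; |I_{m_K^{(j)}}\rangle\rangle . \tag{18}$$

Exploiting Eqs. (16) and (17), the fidelity (11) can be
rewritten as

$$F = \sum_K \frac{d_K}{d^2}\, \langle\alpha_K| R_{KK} |\alpha_K\rangle . \tag{19}$$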
Theorem 1 (Optimal retrieving strategy) The optimal
retrieving of U from the memory state $|\varphi_U\rangle$ is
achieved by measuring the ancilla with the optimal
POVM $P_{\hat U} = |\eta_{\hat U}\rangle\langle\eta_{\hat U}|$ given by $|\eta_{\hat U}\rangle = \bigoplus_j \sqrt{d_j}\, |\hat U_j\rangle\rangle$,
and, conditionally on outcome $\hat U$, by performing the unitary
$\hat U$ on the new input system.

Proof. Let us denote by $P_{KL}^{(j)}$ the projector on the tensor
product $\mathbb C^{m_K^{(j)}} \otimes \mathbb C^{m_L^{(j)}}$, and by $R_{KL}^{(j)} = P_{KL}^{(j)} R_{KL} P_{KL}^{(j)}$
the corresponding diagonal block of $R_{KL}$. Using Schur's
lemmas and Eq. (16) we obtain

$$\mathrm{Tr}_{2N+3}[R] = \bigoplus_j \bigoplus_L \Big( I_j \otimes I_L \otimes \sum_{K:\, j \in P_{KL}} \frac{d_K}{d_j}\, \mathrm{Tr}_{m_K^{(j)}}\big[R_{KL}^{(j)}\big] \Big) . \tag{20}$$

Equation (13) then becomes $\sum_K d_K\, \mathrm{Tr}_{m_K^{(j)}}[R_{KL}^{(j)}] = d_j\, I_{m_L^{(j)}}$, which for
K = L implies the bound

$$\mathrm{Tr}\big[R_{KK}^{(j)}\big] \le \frac{d_j\, m_K^{(j)}}{d_K} . \tag{21}$$

For the fidelity (19) we then have the bound

$$F = \sum_K \frac{d_K}{d^2} \sum_{j,j' \in P_{KK}} \sqrt{\frac{p_j\, p_{j'}}{d_j\, d_{j'}}}\; \langle\langle I_{m_K^{(j)}}| R_{KK} |I_{m_K^{(j')}}\rangle\rangle \tag{22}$$

$$\le \sum_K \frac{d_K}{d^2} \Big( \sum_{j \in P_{KK}} \sqrt{\frac{p_j}{d_j}\, \langle\langle I_{m_K^{(j)}}| R_{KK}^{(j)} |I_{m_K^{(j)}}\rangle\rangle} \Big)^{2} \tag{23}$$

$$\le \sum_K \frac{1}{d^2} \Big( \sum_{j \in P_{KK}} m_K^{(j)} \sqrt{p_j} \Big)^{2} = F_{\rm est} , \tag{24}$$

having used the positivity of $R_{KK}$ for the first bound
and Eq. (21) for the second. Regarding the last
equality, it can be proved as follows. First, the Choi
operator of the estimation-based strategy is $R_{\rm est} =
\int_G |\hat U\rangle\rangle\langle\langle \hat U| \otimes |\eta_{\hat U}^{*}\rangle\langle\eta_{\hat U}^{*}|\, d\hat U$. Using Eq. (17) with $|\varphi^{*}\rangle$
replaced by $|\eta_I^{*}\rangle$ and performing the integral we obtain
$R_{\rm est} = \bigoplus_K (I_K^{\otimes 2} \otimes R_{KK})/d_K$, where $R_{KK} = |\beta_K\rangle\langle\beta_K|$,
$|\beta_K\rangle = \bigoplus_{j \in P_{KK}} \sqrt{d_j}\, |I_{m_K^{(j)}}\rangle\rangle$. Eq. (19) then gives

$$F_{\rm est} = \sum_K \frac{|\langle\alpha_K|\beta_K\rangle|^{2}}{d^2} = \sum_K \frac{1}{d^2} \Big( \sum_{j \in P_{KK}} m_K^{(j)} \sqrt{p_j} \Big)^{2} . \tag{25}$$

■

The above theorem shows that the optimal state for
storing U is identical to the optimal state for estimating
it [12], and, moreover, that the fidelity of unitary learning
with M = 1 is precisely the fidelity of unitary estimation.
Having reduced learning to estimation, we can then
exploit the expressions for the optimal states and fidelities
that are known in most relevant cases. For example,
when U is an unknown qubit unitary in SU(2), learning
becomes equivalent to optimal estimation of an unknown
rotation in the Bloch sphere [13]. For large number
of copies, the optimal input state has the form
$|\varphi\rangle \approx \bigoplus_{j \ge j_{\min}} \sqrt{p_j/d_j}\, |I_j\rangle\rangle$, with $j_{\min} = 0\,(1/2)$ for N
even (odd), and the fidelity is $F \approx 1 - \pi^2/N^2$. Remarkably,
this asymptotic scaling can be achieved without using
entanglement between the set of N qubits that are
rotated and an auxiliary set of N rotationally invariant
qubits: the optimal storing is achieved just by applying
$U^{\otimes N}$ on the optimal N-qubit state. Another example
is that of an unknown phase-shift $U = \exp[i\theta\sigma_z]$. In this
case, for large number of copies the optimal input state is
$|\varphi\rangle = \sqrt{2/(N+1)}\, \sum_m \sin[\pi(m + 1/2)/(N+1)]\, |m\rangle$
and the fidelity is $F \approx 1 - 2\pi^2/(N+1)^2$. Again, the
optimal state can be prepared using only N qubits.

B. Generalization to the M > 1 case

Our result can be extended to the case where the user
must reproduce M > 1 copies of the unknown unitary
U. In this case, there are two different notions of optimality
induced by two different figures of merit, namely
the single-copy and the global fidelity. In the following
we will examine both cases.

1. Optimal learning according to the single-copy fidelity

Let $\mathcal C_U$ be the M-partite channel obtained by the user,
and $\mathcal C^{(1)}_{U,\Sigma}$ be the local channel $\mathcal C^{(1)}_{U,\Sigma}(\rho) = \mathrm{Tr}_{\bar 1}[\mathcal C_U(\rho \otimes \Sigma)]$,
where $\rho$ is the state of the first system, $\Sigma$ is the state of
the remaining M - 1 systems, and $\mathrm{Tr}_{\bar 1}$ denotes the trace
over all systems except the first. The local channel $\mathcal C^{(1)}_{U,\Sigma}$
describes the evolution of the first input of $\mathcal C_U$ when the
remaining (M - 1) inputs are prepared in the state $\Sigma$. Of
course, the fidelity between $\mathcal C^{(1)}_{U,\Sigma}$ and the unitary U cannot
be larger than the optimal fidelity $F_{\rm est}$ of Eq. (24),
and the same holds for any local channel $\mathcal C^{(i)}_{U,\Sigma}$, in which
all but the i-th input system are discarded. Therefore,
the measure-and-prepare strategy presented in Theorem
1 is optimal also for the maximization of the single-copy
fidelity of all local channels, and such fidelity does not
decrease with increasing M.

2. Optimal learning according to the global fidelity

The results of subsection III A can be extended to the
maximization of the global fidelity between $\mathcal C_U$ and $U^{\otimes M}$,
just by replacing U with $U^{\otimes M}$ in all derivations. Indeed,
the role of the target unitary U in our derivations is completely
generic: we never used the fact that the unitary
emulated by the machine was equal to the unitaries provided
in the examples. Therefore, following the same
proofs of subsection III A it is immediate to see that also
for the case of M > 1 copies with global fidelity the
optimal strategy for storing consists in the parallel application
of the examples on an input state of the form
of Lemma 2 and that the optimal strategy for retrieving
consists in measuring the optimal POVM $P_{\hat U}$ and in performing
$\hat U^{\otimes M}$ conditionally on outcome $\hat U$. Therefore,
also in this case optimal learning is equivalent to optimal
estimation: precisely, the optimal learning is achieved by
the estimation strategy that maximizes the expectation
value of the goal function $f_M(U, \hat U) = (|\mathrm{Tr}[U^{\dagger}\hat U]|/d)^{2M}$,
given by $\langle f_M\rangle = \int dU \int d\hat U\, f_M(U, \hat U)\, \langle\varphi_U|P_{\hat U}|\varphi_U\rangle$.
Note that in this case the coefficients $\{p_j\}$ in the optimal
input state of Lemma 2 generally depend on M.

Remark (generalization to nonidentical group
representations). Since we never used the fact that
the N examples are identical, all the results of Subsect.
III A hold even when the input (output) uses are not identical
copies $U^{\otimes N}$ ($U^{\otimes M}$), but generally N (M) different
unitaries, each of them belonging to a different representation
of the group G. For example, if G = SO(3) the N
examples may correspond to rotations (of the same angle
and around the same axis) of N quantum particles with
different angular momenta. Of course, the same remark
also holds for the M output copies.

IV. OPTIMAL INVERSION OF AN UNKNOWN 
UNITARY EVOLUTION 

We now extend our results to the optimal inversion of
an unknown unitary U: in this case the goal is not to
produce M copies of U, but, instead, M copies of its inverse
$U^{\dagger}$. For this task the fidelity of the learning board
is $F' = \frac{1}{d^{2M}} \int_G \langle\langle U^{\dagger}|^{\otimes M} \langle\varphi_U^{*}|\, R' \,|U^{\dagger}\rangle\rangle^{\otimes M} |\varphi_U^{*}\rangle\, dU$,
as obtained by substituting U with $U^{\dagger\otimes M}$ in the target
of Eq. (11). From this expression it is easy to see that one
can always assume $[R',\; V^{\otimes M} \otimes U^{*\otimes M} \otimes \tilde U^{*} \tilde V'] = 0$.
Therefore, the optimal inversion is obtained from our
derivations by simply substituting $U_{2N+3} \to V^{\otimes M}$ and
$V^{*}_{2N+2} \to U^{*\otimes M}$. Accordingly, the optimal inversion is
achieved by measuring the optimal POVM $P_{\hat U}$ on the optimal
state $|\varphi_U\rangle$ and by performing $\hat U^{\dagger\otimes M}$ conditionally
on outcome $\hat U$. This provides the optimal approximate
realignment of reference frames in the quantum communication
scenario recently considered in Ref. [11], proving
the optimality of the "measure-and-rotate" strategy
conjectured therein. In that scenario, the state $|\varphi\rangle \in \tilde{\mathcal H}$
serves as a token of Alice's reference frame, and is sent
to Bob along with a quantum message $|\psi\rangle \in \mathcal H$. Due
to the mismatch of reference frames, Bob receives the decohered
state $\sigma_\psi = \int_G |\varphi_U\rangle\langle\varphi_U| \otimes U|\psi\rangle\langle\psi|U^{\dagger}\, dU$, from
which he tries to retrieve the message $|\psi\rangle$ with maximum
fidelity $f = \int \langle\psi|\mathcal R'(\sigma_\psi)|\psi\rangle\, d\psi$, where $\mathcal R'$ is
the retrieving channel and $d\psi$ denotes the uniform probability
measure over pure states. The maximization of f
is equivalent to the maximization of the channel fidelity
$F' = \frac{1}{d^2} \int_G \langle\langle U^{\dagger}|\langle\varphi_U^{*}|\, R' \,|U^{\dagger}\rangle\rangle|\varphi_U^{*}\rangle\, dU$, which is the figure of
merit for optimal inversion. It is worth stressing that
the state $|\varphi\rangle$ that maximizes the fidelity is not the state
$|\varphi_{\rm lik}\rangle = \bigoplus_j \sqrt{d_j/L}\, |I_j\rangle\rangle$, $L = \sum_j d_j^2$, that maximizes the
likelihood [15]. For M = 1 and G = SU(2), U(1) the
state $|\varphi\rangle$ gives an average fidelity that approaches 1 as
$1/N^2$, while for $|\varphi_{\rm lik}\rangle$ the scaling is 1/N. On the other
hand, Ref. [11] shows that for M = 1 $|\varphi_{\rm lik}\rangle$ allows a
perfect correction of the misalignment errors with probability
of success p = 1 - 3/(N + 1), which is not possible
for $|\varphi\rangle$. The determination of the best input state to
maximize the probability of success, and the study of
the probability/fidelity trade-off remain open interesting
problems for future research.
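The different scalings of the two states can be checked numerically for qubits. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes the estimation fidelity in the form $F = d^{-2} \sum_K \big(\sum_{j \in P_{KK}} m_K^{(j)} \sqrt{p_j}\big)^2$, specializes it to G = SU(2) with d = 2 and M = 1, and assumes the standard angular-momentum rule that $U \otimes U_j^{*}$ contains $K = j \pm 1/2$ (only $j + 1/2$ when j = 0) with multiplicity one. The function names are hypothetical and NumPy is assumed.

```python
import numpy as np

def shared_irreps(two_j):
    """Labels 2K of the irreps K = j +/- 1/2 contained in U x U_j^* (K >= 0)."""
    return {k for k in (two_j - 1, two_j + 1) if k >= 0}

def fidelity(N, p):
    """F = (1/4) sum_K (sum_{j in P_KK} sqrt(p_j))^2 for SU(2) qubits.

    Irreps are labelled by twice the spin; p[i] is the weight of the i-th spin.
    """
    amp = {}
    for two_j, pj in zip(range(N % 2, N + 1, 2), p):
        for two_K in shared_irreps(two_j):
            amp[two_K] = amp.get(two_K, 0.0) + np.sqrt(pj)
    return sum(a ** 2 for a in amp.values()) / 4.0

def optimal_probabilities(N):
    """Weights p_j maximizing F: top eigenvector of the overlap-count matrix."""
    two_js = list(range(N % 2, N + 1, 2))
    B = np.array([[len(shared_irreps(a) & shared_irreps(b)) for b in two_js]
                  for a in two_js], dtype=float)
    _, vecs = np.linalg.eigh(B)        # eigenvalues in ascending order
    v = np.abs(vecs[:, -1])            # entries of the top eigenvector share a sign
    return v ** 2

def likelihood_probabilities(N):
    """Weights p_j = d_j^2 / L of the maximum-likelihood state, d_j = 2j + 1."""
    dims = np.array([two_j + 1 for two_j in range(N % 2, N + 1, 2)], dtype=float)
    return dims ** 2 / np.sum(dims ** 2)

N = 200
F_opt = fidelity(N, optimal_probabilities(N))
F_lik = fidelity(N, likelihood_probabilities(N))
print(f"N={N}: 1-F_opt={1 - F_opt:.3e}, 1-F_lik={1 - F_lik:.3e}")
```

Under these assumptions the infidelity of the optimized state decreases as $1/N^2$, while that of the likelihood state decreases only as 1/N; for the values of N tried here the optimized fidelity coincides numerically with $\cos^2[\pi/(N+3)]$.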



V. CONCLUSIONS 

In conclusion, in this paper we found the optimal 
storing-retrieving of an unknown group transformation 
with N input and M output copies, proving the optimality
of the incoherent "measure-and-rotate" strategy, in
strong contrast with the case of quantum cloning. The 
result has been extended to the optimal inversion of U, 
with application to the optimal approximate alignment 
of reference frames for quantum communication. An in- 
teresting development of this work is the analysis of op- 
timal learning when the unknown unitaries do not form 
a group. This would be the case, for example, of the 
optimal learning of the unknown unitary transformation 
appearing in Grover's quantum search algorithm. The 
question whether coherent quantum strategies can lead 
to an improvement in these cases remains open and worth 
investigating. 